SynthTIGER: Synthetic Text Image GEneratoR Towards Better Text Recognition Models

Abstract

For successful scene text recognition (STR) models, synthetic text image generators have alleviated the lack of annotated text images from the real world. Specifically, they generate multiple text images with diverse backgrounds, font styles, and text shapes, and enable STR models to learn visual patterns that might not be accessible from manually annotated data. In this paper, we introduce a new synthetic text image generator, SynthTIGER, by analyzing techniques used for text image synthesis and integrating effective ones under a single algorithm. Moreover, we propose two techniques that alleviate the long-tail problem in the length and character distributions of training data. In our experiments, SynthTIGER achieves better STR performance than the combination of synthetic datasets, MJSynth (MJ) and SynthText (ST). Our ablation study demonstrates the benefits of using the sub-components of SynthTIGER and provides a guideline on generating synthetic text images for STR models. Our implementation is publicly available at https://github.com/clovaai/synthtiger.
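The long-tail problem in word lengths mentioned in the abstract can be illustrated with a small sketch. The abstract does not describe the two proposed techniques, so the code below is not SynthTIGER's method or API; it is one generic way to flatten a length distribution, assuming a plain Python word list. The function name `length_balanced_sample` and the toy vocabulary are hypothetical.

```python
import random
from collections import Counter


def length_balanced_sample(vocab, n_samples, seed=0):
    """Draw words so that each word length in `vocab` is picked about equally often.

    Illustrative assumption only: this is not taken from the SynthTIGER code base.
    """
    rng = random.Random(seed)

    # Group the vocabulary by word length.
    by_length = {}
    for word in vocab:
        by_length.setdefault(len(word), []).append(word)
    lengths = sorted(by_length)

    samples = []
    for _ in range(n_samples):
        # Choose a length uniformly first, then a word of that length.
        # Rare lengths (the long tail) are no longer under-represented.
        target_len = rng.choice(lengths)
        samples.append(rng.choice(by_length[target_len]))
    return samples


# Toy vocabulary: short words dominate, one long word sits in the tail.
vocab = ["a", "to", "of", "in", "on", "cat", "dog", "sun", "text", "recognition"]
samples = length_balanced_sample(vocab, 6000)
length_counts = Counter(len(word) for word in samples)
```

Sampling words uniformly from this vocabulary would return "recognition" in roughly one draw out of ten; sampling the length first raises that to roughly one in five, at the cost of repeating the few long words more often.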

Related resources

Better Text Understanding Through Image-To-Text Transfer

Generic text embeddings are successfully used in a variety of tasks. However, they are often learnt by capturing the co-occurrence structure from pure text corpora, resulting in limitations of their ability to generalize. In this paper, we explore models that incorporate visual information into the text representation. Based on comprehensive ablation studies, we propose a conceptually simple, y...

Text Matching as Image Recognition

Matching two texts is a fundamental problem in many natural language processing tasks. An effective way is to extract meaningful matching patterns from words, phrases, and sentences to produce the matching score. Inspired by the success of convolutional neural network in image recognition, where neurons can capture many complicated patterns based on the extracted elementary visual patterns such...

Generating Synthetic Data for Text Recognition

Generating synthetic images is an art which emulates the natural process of image generation in a closest possible manner. In this work, we exploit such a framework for data generation in handwritten domain. We render synthetic data using open source fonts and incorporate data augmentation schemes. As part of this work, we release 9M synthetic handwritten word image corpus which could be useful...

Named Entity Recognition in Persian Text using Deep Learning

Named entities recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...

A Unified Approach towards Text Recognition

In our recent research, we found that visual inter-word relations can be useful for different stages of English text recognition such as character segmentation and postprocessing. Different methods had been designed for different stages. In this paper, we propose a unified approach to use visual contextual information for text recognition. Each word image has a lattice, which is a data structure to...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-86337-1_8